What We'll Cover
This is arguably the most important session of the week. AI tools can generate references that look completely real — correct formatting, plausible author names, real journal titles, convincing abstracts — but refer to papers that have never been published. This is not a rare edge case. Studies suggest that general-purpose LLMs hallucinate citations at alarming rates.
In late 2025, the problem hit the highest levels of academic publishing, with consequences that are still unfolding. Hallucinated citations were found in papers at one of the most prestigious machine learning conferences in the world, forcing a reckoning across the research community about the reliability of AI-assisted scholarship.
This session covers the scope of the problem, why it happens at a technical level, real-world case studies that should concern every researcher, and — most importantly — the verification habits you must build to protect your own work. If you take one thing away from this entire week, let it be this: never include a citation in your work that you have not independently verified exists.
An important note: While I was working on this section with Claude, it hallucinated a number of references. I went through everything to check for correctness and relevance.
📊 The Scale of the Problem
Before we look at why hallucination happens, let us be clear about how widespread the problem is. This is not a theoretical concern or a niche failure mode — it is a systemic issue that affects every researcher who uses generative AI tools.
⚖️ Legal Citations: at least 1 in 6 queries produces a hallucination
A Stanford HAI study (Magesh et al., 2024) found that even purpose-built legal AI tools from LexisNexis and Thomson Reuters hallucinated between 17% and 34% of the time on benchmarking queries — despite marketing claims of being "hallucination-free." An earlier Stanford study found general-purpose chatbots hallucinated between 58% and 82% of the time on legal queries.
📈 Academic Citations: 20–90%+
A 2024 comparative analysis in JMIR tested LLMs on systematic review citations and found hallucination rates of 39.6% for GPT-3.5, 28.6% for GPT-4, and a striking 91.4% for Google Bard. A 2025 study in JMIR Mental Health (Linardon et al.) found GPT-4o fabricated 19.9% of all citations across mental health literature reviews. Newer models are improving, but no general-purpose LLM has eliminated the problem.
🔬 Worse in Specialised Fields
The Linardon et al. (2025) study demonstrated this directly: for binge eating disorder (a less-studied topic), GPT-4o's citation fabrication rate jumped to 46%, compared to 17% for major depressive disorder (a well-studied topic). A 2025 study on bibliographic hallucination confirmed that citation accuracy correlates strongly with how frequently a paper appears in training data — highly cited papers are reproduced faithfully, while niche work is fabricated.
🔎 Partial Hallucinations
The GPTZero analysis of NeurIPS 2025 papers documented this pattern in detail: some hallucinated citations started from a real paper but made subtle changes — expanding an author's initials into a guessed first name, dropping or adding co-authors, or paraphrasing the title. These "uncanny valley" citations look entirely legitimate at first glance, making them harder to catch than completely fabricated references.
🔬 Why It Happens
Understanding why LLMs hallucinate citations is essential for knowing when to trust them and when not to. The explanation is not complicated, but it does require shifting how you think about what these systems are doing.
- LLMs are pattern completion systems, not knowledge databases. They generate text that is statistically likely to follow the prompt. A citation that "looks right" is generated because the model has learned the pattern of what citations look like — not because it has looked up a real paper. The model is, in a very real sense, autocompleting the format of a reference list.
- The model has no internal database of papers. It cannot check whether "Smith et al. (2023)" actually exists in any journal. It generates what seems plausible based on patterns in its training data. There is no verification step, no lookup, no fact-checking — just next-token prediction.
- Niche topics and pressure increase hallucination. Models are particularly likely to hallucinate when asked about niche topics, very recent papers, or when pressured to provide a specific number of references. Asking "Give me 10 references on this topic" virtually guarantees that some will be fabricated if the model does not have 10 real ones in its training data.
- Partial hallucinations are especially dangerous. The model may combine real elements — a real author name, a real journal, a plausible-sounding title — into a citation that does not exist as a whole. Because each individual component checks out superficially, these fabrications are much harder to catch than completely invented references.
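Partial hallucinations can be flagged programmatically by comparing a claimed title against what a database actually returns: an exact match is reassuring, a near-miss is suspicious. This is a minimal sketch using Python's standard `difflib`; the 0.6 and 0.95 thresholds are illustrative assumptions, not validated cutoffs.

```python
import difflib


def title_similarity(claimed: str, retrieved: str) -> float:
    """Normalised similarity between two titles, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(
        None, claimed.lower().strip(), retrieved.lower().strip()
    ).ratio()


def flag_near_match(claimed: str, retrieved: str,
                    lo: float = 0.6, hi: float = 0.95) -> bool:
    """Flag titles that are close but not identical -- the 'uncanny
    valley' zone where a real title may have been subtly altered.
    Thresholds are arbitrary starting points, not validated values."""
    return lo <= title_similarity(claimed, retrieved) < hi
```

An exact match scores 1.0 and is not flagged; a paraphrased or slightly expanded title lands in the middle band and gets flagged for manual review.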
⚙️ Retrieval vs. Generation: The Critical Difference
Retrieval-based tools (Semantic Scholar, Elicit, Consensus, Google Scholar) work by searching an actual database of indexed papers. When they return a result, that result exists in the database. The paper is real. The metadata comes from the publisher or indexing service. These tools can have coverage gaps — they might not include every paper ever published — but they do not fabricate papers.
Generation-based tools (ChatGPT, Claude, Gemini when asked for references) produce text token by token without any database lookup. They generate text that follows the statistical patterns of what a citation should look like. There is no step in the process where the model checks a database to verify that the paper exists. The output may be correct (if the model's training data included that specific paper frequently enough), but there is no guarantee.
The practical implication: Use retrieval-based tools to find papers. Use generation-based tools to analyse and discuss papers you have already verified. This division of labour dramatically reduces the risk of hallucinated citations entering your work.
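To make the retrieval side of this division of labour concrete, here is a sketch of checking a claimed title against Semantic Scholar's public Graph API. The endpoint and field names reflect that API as documented at the time of writing, so treat them as assumptions to verify; the live request is kept in a separate function so the matching logic can be inspected on its own.

```python
import json
import urllib.parse
import urllib.request

# Semantic Scholar Graph API search endpoint (assumed from public docs).
SEARCH_ENDPOINT = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(title: str) -> str:
    """Build a title-search URL against the Graph API."""
    params = urllib.parse.urlencode({
        "query": title,
        "fields": "title,year,externalIds",
        "limit": 5,
    })
    return f"{SEARCH_ENDPOINT}?{params}"


def titles_from_response(payload: dict) -> list:
    """Extract candidate titles from a search response payload."""
    return [item.get("title", "") for item in payload.get("data", [])]


def looks_verified(claimed_title: str, payload: dict) -> bool:
    """True if a retrieved title matches the claim (case-insensitive)."""
    claimed = claimed_title.strip().lower()
    return any(t.strip().lower() == claimed
               for t in titles_from_response(payload))


def search_live(title: str) -> dict:
    """Perform the actual HTTP request (requires network access)."""
    with urllib.request.urlopen(build_search_url(title), timeout=10) as resp:
        return json.load(resp)
```

The key property: if `looks_verified` returns True, a matching record exists in an indexed database. A generative model can never give you that guarantee, because there is no payload to check against.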
💥 The NeurIPS 2025 Scandal
The theoretical risk of hallucinated citations became a very real crisis in late 2025, when the problem was discovered at one of the most prestigious venues in computer science.
📰 Case Study: Hallucinated Citations at NeurIPS 2025
In late 2025 and early 2026, researchers and journalists discovered that over 100 hallucinated citations had made it through peer review and into papers published at NeurIPS 2025 — one of the most prestigious AI and machine learning conferences in the world.
Teams from institutions including Google, Harvard, Meta, and Cambridge were among those with papers containing fabricated references. These were not fringe submissions — they were papers from leading research groups at the top of the field.
The scandal exposed a systemic problem at every level of the publication pipeline. Authors were using AI tools to generate or polish their papers and not verifying the references. Peer reviewers — overwhelmed by the volume of submissions and often working under tight deadlines — were not checking whether cited papers actually existed. And the conference itself had no automated systems in place to catch fabricated references.
The discovery triggered a broader investigation that revealed similar problems at other major venues, including ICML and AAAI. The scope of the contamination was larger than anyone had initially feared.
The irony was not lost on anyone: papers about artificial intelligence were themselves corrupted by artificial intelligence. The very community that builds these models had failed to protect itself from their most well-known failure mode.
🔄 The Pollution Loop
The problem does not end when a hallucinated citation is published. It gets worse, because published citations feed back into the systems that generate future citations. This creates a compounding problem that threatens the integrity of the scholarly record itself.
- Hallucinated citations in published papers get indexed by databases. Once a fake reference appears in a published paper, it becomes part of the metadata that databases like Google Scholar and Semantic Scholar index. The fake citation now has an apparent provenance — it was referenced in a real, peer-reviewed paper.
- AI models trained on updated data learn these fake citations as "real." When language models are trained or fine-tuned on new academic text, they absorb hallucinated citations as if they were genuine. The model becomes more confident about the fake paper's existence because it has now "seen" it referenced in a real publication.
- Future AI outputs reference fake papers with more confidence. The next time someone asks a model about the topic, it may cite the hallucinated paper with greater certainty, because the paper now appears in its training data as a cited work. It may even generate a plausible abstract or summary of the fake paper.
- This creates a self-reinforcing cycle of academic pollution. Each cycle makes the fake citation harder to detect and more deeply embedded in the literature. What started as a single hallucinated reference can, over time, become part of the accepted knowledge base of a field.
✅ The Verification Toolkit
The good news is that hallucinated citations are almost always detectable — if you know what to look for and have a systematic verification process. The problem is not that fake citations are undetectable; the problem is that researchers are not checking. This section gives you the tools and habits to be someone who always checks.
📋 The Five-Point Citation Check
Before any AI-surfaced citation enters your work, run it through these five checks:
- Does the paper exist? Search Google Scholar, Semantic Scholar, or the journal's website for the exact title.
- Are the authors correct? Check the authors' actual publication lists on their institutional pages or Google Scholar profiles.
- Is the year correct? Verify the publication year on the journal or preprint server.
- Is the journal or venue real? Confirm it is not a hallucinated journal name or a predatory publication.
- Does it actually say what the AI claims? Read at least the abstract; AI tools frequently misrepresent the conclusions of real papers. If you are citing specific claims from within the paper, read the relevant sections too.
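The five checks above can be tracked as a simple record per citation, so that nothing enters your reference list until every box is ticked. A minimal sketch; the field names are my own shorthand for the five checks:

```python
from dataclasses import dataclass, fields


@dataclass
class CitationCheck:
    """One record per AI-surfaced citation. A citation passes only
    when all five checks have been confirmed by hand."""
    paper_exists: bool = False     # found by exact-title search
    authors_correct: bool = False  # matches authors' publication lists
    year_correct: bool = False     # confirmed on journal/preprint server
    venue_real: bool = False       # not hallucinated or predatory
    claims_match: bool = False     # abstract supports the claimed finding

    def passed(self) -> bool:
        return all(getattr(self, f.name) for f in fields(self))

    def failures(self) -> list:
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

Note the defaults: every check starts as False, so the burden of proof is on verification, not on the AI's confident formatting.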
🔧 Tools for Verification
- Google Scholar — Search the exact title in quotation marks. If it does not appear, the paper likely does not exist.
- Semantic Scholar — Search by DOI, title, or author. Offers a clean API and good coverage.
- CrossRef / doi.org — Enter a DOI to verify it resolves to a real paper. If the DOI does not resolve, the citation is fabricated.
- Retraction Watch Database — Check whether a paper has been retracted after publication.
- Scite.ai — See how a paper has been cited by others — whether supporting or contrasting its claims.
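A quick first-pass filter before visiting doi.org: check whether a DOI is even syntactically plausible. The pattern below follows CrossRef's published recommendation for matching modern DOIs; it is a screening heuristic only, since a well-formed DOI can still be fabricated and must be resolved to count as verified.

```python
import re

# CrossRef's recommended pattern for modern DOIs (covers the vast
# majority of registered DOIs; syntax alone proves nothing).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:a-z0-9]+$", re.IGNORECASE)


def is_plausible_doi(doi: str) -> bool:
    """Syntax check only -- a well-formed DOI must still be resolved."""
    return bool(DOI_PATTERN.match(doi.strip()))


def resolver_url(doi: str) -> str:
    """URL to paste into a browser; a fabricated DOI will not resolve."""
    return f"https://doi.org/{doi.strip()}"
```

A DOI that fails the syntax check is almost certainly fabricated or mangled; a DOI that passes still needs the resolution step.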
🚩 Red Flags for Hallucinated Citations
- Author names that are very common (Smith, Zhang, Lee) combined with generic-sounding titles
- Papers supposedly published in very prestigious journals that you cannot find online
- Suspiciously convenient publication years (2020, 2023) for niche topics
- Multiple citations with suspiciously similar formatting patterns or phrasing
- Papers whose conclusions perfectly match what you asked the AI to support — real research is messier than that
- DOIs that do not resolve when you enter them at doi.org
🛠️ Building Good Habits
- Never include a citation you have not verified. This is the single most important rule. No exceptions.
- Keep a verification log. Document your checking process. This protects you if questions arise later.
- Use the right tool for the right job. Use retrieval-based tools (Elicit, Semantic Scholar, Consensus) for finding papers. Use generative tools (ChatGPT, Claude) for analysis and synthesis of papers you have already verified.
- When in doubt, search for the DOI. The Digital Object Identifier is the most reliable way to verify a citation. If a paper has a real DOI that resolves, it almost certainly exists.
- Be sceptical of convenience. If the AI gives you a perfectly formatted reference list that says exactly what you hoped, slow down and verify harder.
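A verification log does not need special tooling; an append-only CSV is enough. This is a minimal sketch with field names of my own choosing, and the example values in the usage below are hypothetical:

```python
import csv
from datetime import date
from pathlib import Path

LOG_FIELDS = ["date", "citation", "doi", "checked_in", "result", "notes"]


def log_verification(log_path, citation, doi, checked_in, result, notes=""):
    """Append one verification record; writes a header row on first use."""
    path = Path(log_path)
    new_file = not path.exists()
    with path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=LOG_FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "citation": citation,
            "doi": doi,
            "checked_in": checked_in,   # e.g. "Google Scholar", "CrossRef"
            "result": result,           # e.g. "verified", "NOT FOUND"
            "notes": notes,
        })
```

One row per citation, recorded at the moment you check it, gives you a dated audit trail if questions about your references arise later.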
📚 Retraction Watch and Ongoing Monitoring
The academic community is not standing still. The hallucinated citation crisis has prompted a growing movement to develop better tools, policies, and practices for protecting the scholarly record.
📰 Retraction Watch
Retraction Watch has become an essential resource for any researcher who cares about the integrity of the literature. Their database tracks retracted papers, and their blog covers emerging issues including AI-generated content in published research. Bookmark this site and check it regularly.
- Automated detection tools are in development. Several research groups are building tools that can automatically flag potentially hallucinated citations by cross-referencing against actual publication databases. These tools are not yet widely deployed, but they represent an important line of defence.
- Preprint servers are adding verification steps. Some preprint platforms are beginning to implement automated citation checks that flag references that cannot be resolved against known databases before a paper is posted.
- Journal policies are evolving rapidly. Major publishers are updating their submission guidelines to require authors to declare AI tool usage and to verify all citations independently. Some journals are beginning to run automated citation checks as part of the editorial process.
- Conference organisers are responding. In the wake of the NeurIPS scandal, major AI conferences are implementing new review guidelines that specifically address AI-generated content and citation verification.
Key Takeaways
- Hallucinated citations are common, not rare. General-purpose LLMs fabricate references at rates ranging from roughly 20% to over 90%, depending on the model and topic. The citations look real, with correct formatting, plausible authors, and real-sounding journals, but the papers do not exist.
- This happens because LLMs are pattern generators, not databases. They produce text that looks like a citation because they have learned what citations look like. There is no lookup step, no verification, no connection to any real database of papers.
- The NeurIPS 2025 scandal proved this is not hypothetical. Over 100 hallucinated citations made it through peer review at one of the world's top conferences, from teams at leading institutions. If it can happen there, it can happen anywhere.
- The pollution loop makes the problem worse over time. Hallucinated citations that are published get absorbed into databases and training data, making future hallucinations more confident and harder to detect.
- This is a solvable problem if you build good habits. The Five-Point Citation Check, the right verification tools, and a commitment to never including an unverified citation — these habits will protect your work and contribute to the integrity of the scholarly record.
- Use retrieval tools for finding, generative tools for thinking. Semantic Scholar, Elicit, and Consensus search real databases. ChatGPT and Claude generate text. Use each where it is strong, and do not ask a generative tool to do a retrieval tool's job.